Provenance Capture and Use: A Practical Guide
Abstract
There is widespread recognition across MITRE’s sponsors of the importance of capturing the provenance of information (sometimes called lineage or pedigree). However, the technology for supporting the capture and use of provenance is relatively immature. While there has been much research, few commercial capabilities exist. In addition, there is neither a commonly understood concept of operations nor established best practices for capturing and using provenance information in a consistent and principled way. This document captures lessons learned from the IM-PLUS project, which is prototyping a provenance capability that synthesizes prior research; the project is also applying the prototype to government application scenarios spanning defense, homeland security, and bio-surveillance. We describe desirable features of a provenance capability and trade-offs among alternative provenance approaches. The target audience is systems engineers and information architects advising our sponsors on how to improve their information management.
Similar resources
Provenance Datasets Highlighting Capture Disparities
Provenance information is inherently affected by the method of its capture. Different capture mechanisms create very different provenance graphs. In this work, we describe an academic use case that has corollaries in offices everywhere. We also describe two distinct possibilities for provenance capture methods within this domain. We generate three data sets using these two capture methods: the ...
Provenance Capture Disparities Highlighted through Datasets
Provenance information is inherently affected by the method of its capture. Different capture mechanisms create very different provenance graphs. In this work, we describe an academic use case that has corollaries in offices everywhere. We also describe two distinct possibilities for provenance capture methods within this domain. We generate three datasets using these two capture methods: the c...
On the Use of Abstract Workflows to Capture Scientific Process Provenance
Capturing provenance about artifacts produced by distributed scientific processes is a challenging task. For example, one approach to facilitate the execution of a scientific process in distributed environments is to break down the process into components and to create workflow specifications to orchestrate the execution of these components. However, capturing provenance in such an environment,...
Integrating Approximate Summarization with Provenance Capture
How to use provenance to explain why a query returns a result or why a result is missing has been studied extensively. Recently, we have demonstrated how to uniformly answer these types of provenance questions for first-order queries with negation and have presented an implementation of this approach in our PUG (Provenance Unification through Graphs) system. However, for realistically sized data...
Provenance and Case-Based Reasoning
Computational science takes a multidisciplinary approach to scientific investigation, tightly linking scientific research with computational studies and processes such as numerical simulation, data management, and visualization to study complex phenomena such as weather systems. The scientific importance of such processes has led to significant interest in recording the provenance of the data p...